Searching Protein 3-D Structures in Faster Than Linear Time
Abstract
Searching a 3-D structure database of proteins for similar structures is one of the most important problems in post-genomic computational biology. To compare two structures, the RMSD (root mean square deviation) is ordinarily used as the similarity measure. We consider the fundamental problem of finding, in a 3-D structure database, all substructures whose RMSD to the query is within a given threshold. The problem also appears in many other fields, such as computer vision and robotics. In this paper, we propose the first algorithm that runs in faster than linear time on average. Our new algorithm runs in average-case O(m + N/m^(1−ε)) time, where N is the database size, m is the query length, and ε is an arbitrarily small constant such that 0 < ε < 1. This is a significant improvement over previous algorithms for the problem, considering that the best known worst-case time complexity is O(N log m) and the best previously known average-case (expected) time complexity was O(N).
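To make the problem statement concrete, the following is a minimal sketch of the brute-force baseline, not the algorithm proposed in the paper. It assumes that the database and the query are given as arrays of 3-D coordinates, and that RMSD means the deviation after the best rigid superposition (translation plus rotation), computed here with the standard Kabsch/SVD method. The function names `superposed_rmsd` and `naive_search` are hypothetical and chosen only for illustration.

```python
import numpy as np


def superposed_rmsd(P, Q):
    """RMSD between two equal-length 3-D point sets after optimal rigid superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Optimal translation: move both centroids to the origin.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Optimal rotation via SVD of the 3x3 covariance matrix (Kabsch method).
    H = Pc.T @ Qc
    U, S, Vt = np.linalg.svd(H)
    # If the best orthogonal map is a reflection, flip the smallest
    # singular value so that only proper rotations are considered.
    if np.linalg.det(U @ Vt) < 0:
        S[-1] = -S[-1]
    e0 = (Pc ** 2).sum() + (Qc ** 2).sum()
    # Clamp at zero to absorb floating-point round-off.
    msd = max((e0 - 2.0 * S.sum()) / len(P), 0.0)
    return float(np.sqrt(msd))


def naive_search(database, query, threshold):
    """Return every start index i with RMSD(database[i:i+m], query) <= threshold."""
    database = np.asarray(database, dtype=float)
    query = np.asarray(query, dtype=float)
    m = len(query)
    hits = []
    # One O(m) RMSD computation per window: O(N * m) time overall.
    for i in range(len(database) - m + 1):
        if superposed_rmsd(database[i:i + m], query) <= threshold:
            hits.append(i)
    return hits
```

This baseline examines every length-m window of the database, so it takes O(Nm) time. The bounds quoted in the abstract, O(N log m) in the worst case and O(m + N/m^(1−ε)) on average, describe how far more refined algorithms improve on it.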
Similar resources
Searching Protein 3-D Structures in Linear Time
One of the most important issues in post-genomic molecular biology is the analysis of protein three-dimensional (3-D) structures, and searching 3-D structure databases is becoming more and more important. The root mean square deviation (RMSD) is the most popular similarity measure for comparing two molecular structures. In this article, we propose new theoretically and prac...
Space Efficient Data Structures for Dynamic Orthogonal Range Counting
We present a linear-space data structure that maintains a dynamic set of n points with coordinates of real numbers on the plane to support orthogonal range counting, as well as insertions and deletions, in O((lg n / lg lg n)^2) time. This provides faster support for updates than previous results with the same bounds on space cost and query time. We also obtain two other new results by considering t...
Geometric Suffix Tree: A New Index Structure for Protein 3-D Structures
Protein structure analysis is one of the most important research issues in the post-genomic era, and faster and more accurate query data structures for such 3-D structures are highly desired for research on proteins. This paper proposes a new data structure for indexing protein 3-D structures. For strings, there are many efficient indexing structures such as suffix trees, but it has been consid...
A Modification on Applied Element Method for Linear Analysis of Structures in the Range of Small and Large Deformations Based on Energy Concept
In this paper, the formulation of a modified applied element method for linear analysis of structures in the range of small and large deformations is expressed. To calculate deformations in the structure, the minimum total potential energy principle is used. This method estimates the linear behavior of the structure in the range of small and large deformations, with a very good accuracy and low...
Increased Biomass and Growth of the Microalga Dunaliella under Vanillin Treatment
One of the important goals of plant physiology in aquatic ecosystems is to increase the biomass and growth of green microalgae, which gives access to their valuable products. Many factors can increase or decrease algal biomass. In this research, different levels of vanillin (C8H8O3), namely 0 (control), 10, 25, 40, 50, 70, 90 and 100 mg L^-1, were investigated ...